METHOD AND APPARATUS FOR DETERMINING ONE OR MORE STATISTICAL 
ESTIMATORS OF CUSTOMER BEHAVIOUR 
Background of the Invention 

Field of the Invention 

This invention relates to a method and apparatus for determining one or more 
statistical estimators of customer behaviour. The invention is particularly related to, 
but in no way limited to, modelling customer behaviour using a Bayesian statistical 
hidden Markov model technique. 

Description of the prior art 

Businesses typically have records of customer transaction histories. These 
records contain information that is potentially very valuable to the business because 
it enables the business to analyse customer behaviour and use this "feedback" to 
help plan the future of the business. However, assessments of the available data 
only provide information about customer behaviour that has already occurred. This 
is a drawback because behaviour patterns typically change over time. For example, 
a customer who is at present not very profitable could become more profitable in the 
future. There is thus a need to predict the future behaviour of customers. 

One particular example concerns a business such as a bank which wishes to 
predict when a customer is likely to leave the bank. In that case such a prediction 
would be extremely advantageous because it allows the bank to take action such as 
to give incentives to the customer to prevent them from leaving. 

Bayesian statistical techniques have been used to "learn" or make predictions 
on the basis of a historical data set. Bayes' theorem is a fundamental tool for a 
learning process that allows one to answer questions such as "How likely is my 
hypothesis in view of these data?" For example, such a question could be "How 
likely is a particular future event to occur in view of these data?" 
Bayes' theorem is written as: 

$$P(H \mid \text{data}) = \frac{P(\text{data} \mid H)\, P(H)}{P(\text{data})}$$

which can also be written as: 

$$P(H \mid \text{data}) \propto P(\text{data} \mid H) \cdot P(H)$$

because P(data) is unconditional and thus does not depend on H. 

The probability of H given the data, P(H | data), is called the posterior 
probability of H. The unconditional probability of H, P(H), is called the prior 
probability of H, and the probability of the data given H, P(data | H), is called the 
likelihood of H. By using knowledge and experience about past data an assessment 
of the prior probability can be made. New data is then collected and used to update 
the prior probability following Bayes' theorem to produce a posterior probability. This 
posterior probability is then a prediction in the sense that it is a statement about the 
likelihood of a particular event occurring in the future. However, it is not simple to 
design and implement such Bayesian statistical methods in ways that are suited to 
particular practical applications. 

It is accordingly an object of the present invention to provide a method and 
apparatus for determining one or more statistical estimators of customer behaviour, 
which overcomes or at least mitigates one or more of the problems noted above. 



Summary of the Invention 

According to an aspect of the present invention there is provided a method of 
determining one or more statistical estimators of future customer behaviour 
comprising the steps of:- 

• accessing data about past customer behaviour; 

• generating a Bayesian statistical model using the data about the past customer 
behaviour; and 

• using the model to generate one or more statistical estimators of future customer 
behaviour. 

A corresponding computer system is provided for determining one or more 
statistical estimators of future customer behaviour comprising:- 

• an input arranged to access data about past customer behaviour; 

• a processor arranged to generate a Bayesian statistical model using the data 
about the past customer behaviour; and 

• wherein said processor is further arranged to use the model to generate one or 
more statistical estimators of future customer behaviour. 

A corresponding computer program is provided for controlling a computer 
system such that one or more statistical estimators of future customer behaviour are 
determined said computer program being arranged to control the computer system 
such that:- 

• data about past customer behaviour is accessed; 

• a Bayesian statistical model is generated using the data about the past customer 
behaviour; and 

• using the model, one or more statistical estimators of future customer behaviour 
are generated. 




This provides the advantage that the statistical estimators of future customer 
behaviour are obtained and these may be used by a business, for example, to 
improve its performance. The data about past customer behaviour may comprise 
information about customer transactions such as cash machine withdrawal 
frequency. By using the method future customer transactions can then be predicted. 

Preferably the method further comprises accessing information about customer 
attributes and wherein said model is generated using the information about customer 
attributes. This gives the advantage that the model is improved and found to enable 
good statistical estimators of future customer behaviour to be produced. The 
customer attributes could be the age, sex and salary of customers for example. 

It is also preferred that the model comprises a representation of the customer 
behaviour in the form of a hidden Markov model with a random number of states. 
Moreover, it is preferred that the step of generating the model comprises clustering 
the past customer behaviour data into a plurality of states. It has unexpectedly been 
discovered that this type of statistical model is particularly effective for modelling 
customer behaviour data such as information about bank customers. 

Advantageously, the behaviour of each customer over time is represented as a 
path through a plurality of the states and wherein these paths are unobserved and 
are considered random. This enables the evolution of customer behaviour over time 
to be modelled and in this way predictions about future customer behaviour can then 
be obtained from the model. 

Preferably, each state is characterised by a random state parameter and 
preferably the model uses multi-variate customer data. That is, a plurality of 
customer attributes such as age, sex and salary are used. This enables the model 
to be more effective for customer data and for particular applications such as 
predicting the future behaviour of bank customers. 
Further benefits and advantages of the invention will become apparent from a 
consideration of the following detailed description given with reference to the 
accompanying drawings, which specify and show preferred embodiments of the 
invention. 

Brief description of the drawings 

Figure 1 is a flow diagram of a method of generating statistical estimators of 
customer behaviour. 

Figure 2 is a flow diagram showing more detail about the step of generating a 
Bayesian statistical model from Figure 1. 

Figure 3 is a schematic diagram of a path between states which represents a 
customer's behaviour over time. 

Figure 4 is a schematic diagram of a computer system. 

Detailed description of the invention 

Embodiments of the present invention are described below by way of 
example only. These examples represent the best ways of putting the invention into 
practice that are currently known to the Applicant although they are not the only 
ways in which this could be achieved. 

Consider a business such as a bank. This bank may have beliefs, 
experience and past data about customer transactions. Using this information the 
bank can form an assessment of the prior probability that a particular customer will 
exhibit a certain behaviour, such as leaving the bank. The bank may then collect new 
data about that customer's behaviour and, using Bayes' theorem, can update the prior 
probability using the new observed data to give a posterior probability that the 
customer will exhibit the particular behaviour such as leaving the bank. This 
posterior probability is a prediction in the sense that it is a statement of the likelihood 
of an event occurring. In this way the present invention uses Bayesian statistical 
techniques to make predictions about customer behaviour. However, as mentioned 
above, it is not simple to design and implement such methods in ways that are suited 
to particular applications. The present invention involves such a method and is 
described in more detail below. 
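
The prior-to-posterior update described above can be illustrated with a minimal 
sketch. The numbers used (a 5% prior probability of leaving, and the rates at which a 
drop in cash machine usage is seen among leavers and stayers) are hypothetical 
values chosen only for illustration, as is the function name. 

```python
def posterior_probability(prior, likelihood_if_true, likelihood_if_false):
    """Return P(H | data) from P(H), P(data | H) and P(data | not H)."""
    # P(data) expanded over the two cases H and not H.
    evidence = likelihood_if_true * prior + likelihood_if_false * (1.0 - prior)
    return likelihood_if_true * prior / evidence


# Hypothetical figures: 5% of customers leave in a given period (prior);
# a sharp drop in withdrawals is seen for 60% of leavers and 10% of stayers.
p_leave = posterior_probability(
    prior=0.05, likelihood_if_true=0.60, likelihood_if_false=0.10
)
print(round(p_leave, 3))
```

Under these assumed figures, observing the drop in usage raises the probability that 
the customer will leave from the prior value to a noticeably higher posterior value. 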



Figure 1 is a flow diagram of a method of determining statistical estimators of 
customer behaviour. Data about past customer behaviour is accessed (box 10 of 
Figure 1). For example, this data comprises information about customer 
transactions such as the frequency of cash withdrawals at a bank's ATMs 
and the amount of money withdrawn each time. Using this data a Bayesian 
statistical model is generated (see box 11 of Figure 1) and this model is then used to 
generate one or more statistical estimators of future customer behaviour (box 12 of 
Figure 1). As well as data about past customer behaviour, customer attributes such 
as age, sex and salary may be used to create the model. 

The Bayesian statistical model that is used may be any suitable type of model 
which clusters the customer data and attributes into a finite number of states. Any 
suitable type of hidden Markov model technique may be used to achieve this. 

In this way the Bayesian statistical model represents customer behaviour 
using a plurality of states (the number of which is unknown and considered random) 
where each state is characterised by a plurality of parameters. At a given point in 
time a customer's behaviour is represented using one of these states; that is, the 
customer's behaviour at a particular time is a member of a particular state. All 
customers within a state are assumed to have behaviour that is homogeneous in 
some way. These states may be found to correspond to particular lifestyle groups 
such as employed single people, unemployed people, students etc. However, it may 
well also be the case that the clusters or states generated by the model do not 
correspond to lifestyle groups or other classes that are meaningful in social terms. In 
order to represent a customer's behaviour over time, the model uses an unobserved 
path through these states. This is illustrated schematically in Figure 3. Time 
snapshots are represented by large circles 30 and, within these, clusters or states are 
represented by smaller black circles 31. Arrow 32 represents time. Suppose that a 
particular customer has behaviour at a first time that is represented by cluster 33 of 
the left most circle 30. The behaviour of that customer over time is then represented 
as a path between a state in each time shot circle 30. For example, Figure 3 shows 
such a path 33 for a customer who changes behaviour in each time shot. Thus 
customers are considered to move through different states over time, according to 
state transition probabilities, as their customer data and attributes evolve. In the 
statistical model used the paths of each customer through the states over time are 
not observed and are estimated or considered random. Also, each state k is 
characterised by a random state parameter $\theta^{(k)}$. Observed customer transactions 
whilst they are in state k are assumed to follow a parametric probability model 
$p(\text{Data} \mid \theta^{(k)})$. 
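
The notion of a customer moving through states according to state transition 
probabilities can be sketched as a short simulation. This is only an illustration under 
assumed values: the three states, the starting probabilities and the transition matrix 
below are hypothetical, and in the method itself the path is unobserved rather than 
simulated. 

```python
import random


def simulate_path(initial_probs, transition_matrix, length, rng):
    """Draw one path through the states, one state per time snapshot."""
    states = list(range(len(initial_probs)))
    path = [rng.choices(states, weights=initial_probs)[0]]
    for _ in range(length - 1):
        # The next state depends only on the current state (Markov property).
        row = transition_matrix[path[-1]]
        path.append(rng.choices(states, weights=row)[0])
    return path


# Hypothetical three-state example; the last state can never be left.
initial_probs = [0.6, 0.4, 0.0]
transition_matrix = [
    [0.80, 0.15, 0.05],
    [0.10, 0.80, 0.10],
    [0.00, 0.00, 1.00],
]
path = simulate_path(initial_probs, transition_matrix, length=12, rng=random.Random(0))
print(path)
```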

A particular advantage of the present invention is that the model is arranged 
to deal with customer data comprising more than one parameter or attribute per 
customer. That is, the hidden Markov model technique used is arranged to use data 
that is not univariate. For example, a plurality of attributes for each customer (e.g. 
age, sex, salary) are used together with transaction data such as frequency of cash 
withdrawals from ATMs. By using data that is multivariate (as opposed to 
univariate data) the model is improved such that the results are more accurate 
predictions of customer behaviour. As described below, Robert et al (see section 
headed "references" below for full publication details) have described use of a hidden 
Markov model with a random number of states, but for only one time series of 
univariate data. Also, Robert et al did not consider applying these techniques to 
customer data such as information about transactions and withdrawals from cash 
machines. It is not obvious that clustering techniques such as hidden Markov 
models are effective at dealing with such customer data and it has unexpectedly 
been discovered that the methods described herein are effective for such data. 

Figure 2 is a flow diagram giving more detail about the step of generating the 
Bayesian statistical model. Bayesian prior probability distributions are specified for 
the number of states, the probabilities of a new customer starting in each state, the 
probabilities of moving between the different states and the state parameters (see 
boxes 21 to 23 of Figure 2). As already mentioned, the observed customer data is 
represented for each state using a parametric probability model (see box 24 of 
Figure 2). Using Bayes' theorem, the Bayesian prior probability distributions, the 
accessed data and the parametric probability models are combined to generate a 
posterior probability distribution for each of: 

• the number of states; 

• the probabilities of a new customer starting in each state; 

• the probabilities of moving between the different states; and 

• the state parameters (see box 25 of Figure 2). 

In the case that the unobserved state paths are treated as random, posterior 
probability distributions are also generated for these unobserved state paths. 
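
Written out, the combination described above has the following general shape. This is 
a sketch that assumes the prior distributions on the transition probabilities P, the 
starting probabilities q and the state parameters $\theta$ are independent given the 
number of states n; z denotes the unobserved state paths and W, X, Y the observed 
customer data, following the notation of the detailed example. 

$$p(n, P, q, \theta, z \mid W, X, Y) \;\propto\; p(W, X, Y, z \mid n, P, q, \theta)\; p(n)\; p(P \mid n)\; p(q \mid n)\; p(\theta \mid n)$$

The first factor on the right is the likelihood built from the parametric probability 
models, and the remaining factors are the Bayesian prior probability distributions. 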

The posterior probability distribution is then used to generate statistical 
estimators of future customer behaviour. For example, this may be done by using 
numerical or analytical methods to calculate the posterior probability distribution. 
Alternatively, and in a preferred embodiment, a sampling method is used to draw 
approximate random samples from the posterior distribution. Any suitable sampling 
method such as Gibbs sampling methods may be used. Once the samples have 
been drawn, Monte Carlo inference is performed using the samples to generate the 
statistical estimators. For example, marginal distributions and predictive densities 
can be estimated. 

In the case that the customer data comprises information about transactions, 
the method gives outputs such as probabilities that particular customers will enter 
into certain transactions. For example, if the customer is a bank customer, the 
probability that a customer will leave a bank at a certain time can also be estimated. 
In this way an estimate of the lifetime value of that customer to the bank can be 
gained. 

A detailed example of the method is now described: 

Suppose there are R reference customers with whom the customer 
relationship has now ended and C current customers, and so N = R + C customers 
overall. Then for each customer $i = 1, \ldots, N$, let $n_i$ be the number of time units (e.g. 
weeks) over which transactions have been recorded. It is assumed that there are 
three observation types: a vector of attributes, $W_i$, that do not vary over time (e.g. 
the customer's sex); a matrix with $n_i$ columns of attributes, $X_i$, which change over 
time but in a deterministic way (e.g. the customer's age each week); and a matrix 
with $n_i$ columns of transactions, $Y_i$, which change over time in a non-deterministic 
way (e.g. the number of ATM visits made by a customer each week). 
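
One possible in-memory layout for these three observation types is sketched below. 
The class and field names are hypothetical and chosen for illustration only; each 
matrix is held as a list with one column (itself a list of values) per time unit. 

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CustomerRecord:
    """Observations for one customer i (names are illustrative assumptions)."""

    static_attributes: List[float]    # W_i: fixed over time, e.g. sex
    deterministic: List[List[float]]  # X_i: one column per time unit, e.g. age
    transactions: List[List[float]]   # Y_i: one column per time unit, e.g. ATM visits
    relationship_ended: bool          # True for one of the R reference customers

    def __post_init__(self):
        # X_i and Y_i must both have n_i columns.
        if len(self.deterministic) != len(self.transactions):
            raise ValueError("X_i and Y_i must cover the same time units")

    @property
    def n_time_units(self) -> int:
        """n_i: number of time units over which transactions were recorded."""
        return len(self.transactions)


record = CustomerRecord(
    static_attributes=[1.0],
    deterministic=[[34.0], [34.0], [34.0]],
    transactions=[[2.0], [0.0], [3.0]],
    relationship_ended=False,
)
print(record.n_time_units)
```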

The evolution of customer behaviour is represented as a hidden Markov 
model (HMM) with a random number of states as described in Robert et al (2000). 
This model says that at any point in time a customer can be described as falling into 
one of a finite number of states, and that within states customers will behave in some 
homogeneous way. The number of states n is taken to be unknown and a Bayesian 
prior distribution is assigned. One choice would be n distributed uniformly on 
$\{1, \ldots, n_{\max}\}$ for some upper bound $n_{\max}$. It is not essential to assume that the 
number of states is uniformly distributed in this way. Any other suitable distribution 
for the number of states may be chosen. Each customer transaction history can then 
be viewed as dependent on an unobserved path $z_i$ of length $n_i$ through these states. 

The Markov model is completed by the specification of an $n \times n$ transition 
probability matrix P with $p_{ij}$ the probability of moving from state $i$ to state $j$. State 
n is fixed to be the "end" state, representing the end of the customer relationship. 
Once entered this state cannot be left, so $p_{nn} = 1$ and $p_{nj} = 0$ for $j \neq n$. No 
transactions can be observed in this state. 
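
To illustrate how such a matrix behaves, the following sketch propagates a customer's 
state distribution forward in time and reads off the probability of having reached the 
end state. The three-state matrix is a hypothetical example, and states are indexed 
from 0 in the code so that the end state is the last index. 

```python
def step(dist, P):
    """Advance a probability distribution over states by one time unit."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


def prob_left_within(start_state, P, horizon):
    """Probability of being in the end state after `horizon` time units."""
    n = len(P)
    dist = [0.0] * n
    dist[start_state] = 1.0
    for _ in range(horizon):
        dist = step(dist, P)
    # The end state is absorbing, so this is also the probability of having
    # left at any time up to and including the horizon.
    return dist[n - 1]


# Hypothetical matrix; the last row encodes p_nn = 1 and p_nj = 0 otherwise.
P = [
    [0.90, 0.08, 0.02],
    [0.05, 0.85, 0.10],
    [0.00, 0.00, 1.00],
]
for horizon in (1, 4, 12):
    print(horizon, round(prob_left_within(0, P, horizon), 4))
```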

One choice of prior distribution is to assume that, for $i = 1, \ldots, n-1$, the $i$th row $P_i$ 
of the matrix P follows a Dirichlet distribution with parameter vector $\delta_i$. This 
provides the choice of setting $\delta_{ii} \gg \delta_{ij}$ for $j \neq i$ to make remaining in one's present 
state much more likely than moving. Write $\pi$ for the stationary distribution of P, so 
the probability of being at state $i$ at a randomly selected time is $\pi_i$. 

It is not essential to use a Dirichlet distribution as described above. Any other 
suitable distribution could be used. For example, an (n-1)-variate normal distribution 
that is truncated so that each element lies between 0 and 1 and so that its sum is 
less than or equal to 1 could be used. By using a Dirichlet distribution computational 
advantages are achieved and it is simple to specify that a customer has a high 
probability of staying in the same state between consecutive "time shots". 
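
A minimal sketch of such a "sticky" Dirichlet prior is given below. It draws a row of 
the transition matrix by normalising independent gamma variates, which is a standard 
way of sampling from a Dirichlet distribution. The parameter values (20 for the 
customer's own state, 1 elsewhere) and the function names are assumptions made for 
illustration. 

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one probability vector from a Dirichlet(alpha) distribution."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sticky_parameter(n_states, i, self_weight, other_weight):
    """Parameter vector for row i with a much larger weight on staying put."""
    return [self_weight if j == i else other_weight for j in range(n_states)]


rng = random.Random(1)
alpha_row0 = sticky_parameter(3, 0, self_weight=20.0, other_weight=1.0)
row = sample_dirichlet(alpha_row0, rng)
print([round(p, 3) for p in row])
```

With these assumed weights the prior mean probability of staying in the same state is 
20/22, so most sampled rows place the bulk of their mass on the diagonal entry. 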

If the records of a particular customer start at a random time into the 
customer relationship, the probability of that customer being in state $i$ when the 
records commence is $\pi_i$. 

If, on the other hand, the records start at the beginning of a customer 
relationship, then the initial state of the customer might have a different probability 
distribution, as some states may be more typical than otherwise for customers with 
whom the relationship has just commenced. Write $q_j$ for the probability of a new 
customer being in state $j$, $j = 1, \ldots, n-1$. For a prior distribution, again one choice 
is to assume that the vector of probabilities $q = (q_1, \ldots, q_{n-1})$ follows a Dirichlet 
distribution with parameter $\delta_q$. 

For each customer $i$, define an identifier $b_i$ which takes the value 1 if the records 
begin at the start of the customer relationship and 0 otherwise. 

Now for each customer $i = 1, \ldots, N$, let $T_i = \{k \mid k \in \{1, \ldots, n\},\ \exists j \in \{1, \ldots, n_i\} \text{ s.t. } z_{ij} = k\}$ 
be the set of states visited by that customer, and let $S_{ki} = \{j \mid j \in \{1, \ldots, n_i\},\ z_{ij} = k\}$ 
be the (possibly empty) set of time indices $j$ which customer $i$ spends in state $k$. 
Note that $n \in T_i$ if and only if customer $i$ is one of the R reference customers with 
whom the customer relationship has ended, and that $S_{ni} = \{n_i\}$ for reference 
customers and $S_{ni} = \emptyset$ otherwise. 
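
These two sets are straightforward to compute from a given state path. The sketch 
below uses a hypothetical path for a reference customer in a model with n = 4 states, 
with states and time indices both numbered from 1 as in the text; the function names 
are illustrative. 

```python
def visited_states(path):
    """T_i: the set of states that appear anywhere on the path."""
    return set(path)


def time_indices_in_state(path, k):
    """S_ki: the (possibly empty) set of time indices spent in state k."""
    return {j for j, state in enumerate(path, start=1) if state == k}


# Hypothetical path of length n_i = 6 ending in the end state n = 4.
z_i = [1, 1, 2, 2, 2, 4]
print(visited_states(z_i))
print(time_indices_in_state(z_i, 2))
print(time_indices_in_state(z_i, 4))
print(time_indices_in_state(z_i, 3))
```

For this path the end state is visited only at the final time index, which matches the 
note above about reference customers, and state 3 yields the empty set. 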

Then for each state $k$ define parameter vectors $\theta^{(k)}$ of length $r$ which 
model the data via suitable parametric models. If conditional independence between 
customer observations given the parameters is assumed, and if a customer's 
transactions are also assumed conditionally independent given the parameters, the 
likelihood function is then given by 

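$$p(W, X, Y, z \mid n, P, q, \theta) = \prod_{i=1}^{N} \left\{ q_{z_{i1}}^{\,b_i}\, \pi_{z_{i1}}^{\,1-b_i} \right\} \prod_{k=1}^{n} \prod_{l=1}^{n} p_{kl}^{\,w_{kl}} \prod_{i=1}^{N} \left\{ \prod_{k \in T_i} p\left(W_i, X_i, Y_i \mid \theta^{(k)}, S_{ki}\right) \right\}$$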




where $w_{kl} = \sum_{i=1}^{N} \sum_{j=1}^{n_i - 1} I(z_{ij} = k,\ z_{i,j+1} = l)$ is the total number of times customers 
changed from state $k$ to state $l$. 
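
The transition counts can be accumulated directly from a set of state paths, as in the 
following sketch. The two paths and the transition matrix are hypothetical, states are 
numbered from 1 as in the text, and the helper that sums count times log-probability 
shows only the contribution of the transitions to the log-likelihood, not the full 
likelihood. 

```python
import math


def transition_counts(paths, n_states):
    """Count, over all customers, the moves from state k to state l."""
    w = [[0] * n_states for _ in range(n_states)]
    for z in paths:
        # Pair each state with its successor along the path.
        for k, l in zip(z, z[1:]):
            w[k - 1][l - 1] += 1
    return w


def transition_log_likelihood(w, P):
    """Sum of w_kl * log(p_kl) over the transitions that were observed."""
    return sum(
        w[k][l] * math.log(P[k][l])
        for k in range(len(P))
        for l in range(len(P))
        if w[k][l] > 0
    )


paths = [[1, 1, 2, 3], [2, 2, 1]]  # hypothetical paths for two customers
w = transition_counts(paths, n_states=3)
P = [[0.5, 0.5, 0.0], [0.25, 0.5, 0.25], [0.0, 0.0, 1.0]]
print(w)
print(transition_log_likelihood(w, P))
```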


One choice of prior distribution for the $\theta^{(k)}$ parameters, which enables modelling of 
possible similarities between states through sharing common components, is to use 
a product of independent Dirichlet processes (see Ferguson, 1973; West et al, 
1994). That is, for component $l = 1, \ldots, r$, 

$$\theta_l^{(1)}, \ldots, \theta_l^{(n)} \sim DP(\alpha F_l)$$

where $\alpha$ is a scalar precision parameter and $F_l$ is a base prior which incorporates 
any prior beliefs that may be held about the distribution of the corresponding 
parameter component. However, it is also possible to use any other suitable prior 
distribution. 
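
The way a Dirichlet process prior lets states share a common component can be seen 
from its Pólya urn representation: each new value is either a fresh draw from the base 
prior or a copy of a value already assigned to an earlier state. The sketch below 
draws one component across the states in this way. The gamma base prior and the 
precision values are assumptions chosen for illustration. 

```python
import random


def polya_urn_draws(n_states, precision, base_sampler, rng):
    """Draw one parameter component for each state under a DP prior."""
    values = []
    for k in range(n_states):
        # With probability precision / (precision + k) take a fresh value
        # from the base prior; otherwise copy an earlier state's value.
        if rng.random() < precision / (precision + k):
            values.append(base_sampler(rng))
        else:
            values.append(rng.choice(values))
    return values


def gamma_base(rng):
    """Hypothetical base prior, e.g. for a weekly transaction rate."""
    return rng.gammavariate(2.0, 1.5)


rng = random.Random(3)
component = polya_urn_draws(n_states=6, precision=1.0, base_sampler=gamma_base, rng=rng)
print([round(v, 3) for v in component])
```

A small precision makes ties between states more likely, while a large precision 
makes the components behave almost like independent draws from the base prior. 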


Bringing this all together, Bayes Theorem gives the posterior distribution of the 
parameters up to proportionality by 

where $\delta(x)$ is a discrete probability mass function placing all its mass on $x$, and 
$f_l$ is the probability density/mass function of the distribution $F_l$. The constant of 
proportionality is the inverse of the multiple integral of the right hand side of the 
equation above with respect to $\{n, P, q, \theta, z\}$. Analytic calculations with the posterior 
distribution are therefore complex. In a preferred embodiment, Markov Chain Monte 
Carlo (MCMC) simulation is used to draw approximate random samples from the 
posterior distribution for making parameter inference and prediction. However, this 
is not essential; any other suitable numerical or analytic method of 
calculating the posterior distribution may be used. 

In a preferred embodiment, MCMC simulation is used as described above. For 
example, Gibbs sampling techniques are used. The Gibbs sampler is an MCMC 
technique for generating samples from the posterior distribution of a set of model 
parameters via the full conditional distributions. For a description of the Gibbs 
sampler and full conditional distributions see Smith and Roberts (1993). Two methods 
using Gibbs sampling are combined here. 
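
As an illustration of sampling from a full conditional distribution, the sketch below 
updates one row of the transition matrix given the current transition counts. It 
relies on a simplifying assumption that is not part of the method as described: that 
the initial-state term does not depend on P (for example, when all records start at 
the beginning of the customer relationship), in which case a Dirichlet prior on a row 
combines with the counts to give a Dirichlet full conditional. The prior parameters 
and counts are hypothetical. 

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one probability vector from a Dirichlet(alpha) distribution."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def gibbs_update_row(prior_row, count_row, rng):
    """Draw a transition row from Dirichlet(prior parameters + counts)."""
    posterior_alpha = [a + c for a, c in zip(prior_row, count_row)]
    return sample_dirichlet(posterior_alpha, rng)


rng = random.Random(5)
prior_row = [1.0, 1.0, 1.0]  # hypothetical Dirichlet parameters for one row
count_row = [50, 2, 0]       # hypothetical counts of moves out of that state
new_row = gibbs_update_row(prior_row, count_row, rng)
print([round(p, 3) for p in new_row])
```

When records can start part-way through a relationship, the stationary distribution 
enters the likelihood and the row update is no longer a plain Dirichlet draw, so this 
sketch would need to be adapted. 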

The first was described by Robert et al (2000) for an HMM with a random number of 
states, but for only one time series of univariate data; the vector parameters 
$\{\theta^{(1)}, \ldots, \theta^{(n)}\}$ are thus replaced by scalar parameters $\{\sigma^{(1)}, \ldots, \sigma^{(n)}\}$. Because the 
number of states n is considered random, the reversible jump MCMC methods of 
Green (1995) are required to explore the variable dimension parameter space. The 
jump moves described by Robert et al (2000) are used here to change the number of 
dimensions, with the only change that methods for deleting or adding a $\sigma^{(k)}$ 
parameter are here performed identically for each component of $\theta^{(k)}$ in turn. The 
Dirichlet process prior across states for corresponding components 
$\theta_l^{(1)}, \ldots, \theta_l^{(n)}$ provides the advantage that two states that are to be merged have 
positive probability of already sharing common components and thus such a move will 
be more likely to be accepted. The Gibbs moves for z and P (and here q) are 
identical to those described by Robert et al (2000). 



To create a Gibbs move for the parameters $\{\theta^{(1)}, \ldots, \theta^{(n)}\}$ conditional on $\{n, P, z\}$, 
the Gibbs sampling strategy of MacEachern (1992) for Dirichlet processes is 
implemented. However, it is not essential to use this particular Gibbs sampling 
strategy. Any other suitable sampling methods can be used. 

Once a large approximate sample from the posterior distribution 
$\{n, P, q, \theta, z\}^{(1)}, \ldots, \{n, P, q, \theta, z\}^{(M)}$ has been collected, Monte Carlo inference about 
aspects of the posterior distribution such as marginal distributions and predictive 
densities can be performed. Thus predictions of customer transactions, how long 
the customer relationship will last and their lifetime value are all readily available. 
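
As a sketch of this kind of Monte Carlo inference, the example below averages a 
quantity of interest over a handful of sampled transition matrices. The three matrices 
stand in for draws from the posterior distribution and are hypothetical values written 
out by hand; in practice M would be large and the matrices would come from the 
sampler. States are indexed from 0 with the end state last. 

```python
def prob_left_within(start_state, P, horizon):
    """Probability of being in the absorbing end state after `horizon` steps."""
    n = len(P)
    dist = [0.0] * n
    dist[start_state] = 1.0
    for _ in range(horizon):
        dist = [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]
    return dist[n - 1]


def monte_carlo_estimate(samples, start_state, horizon):
    """Average the leaving probability over the sampled matrices."""
    values = [prob_left_within(start_state, P, horizon) for P in samples]
    return sum(values) / len(values)


# Hypothetical stand-ins for M = 3 posterior draws of the transition matrix.
posterior_samples = [
    [[0.90, 0.08, 0.02], [0.05, 0.85, 0.10], [0.0, 0.0, 1.0]],
    [[0.88, 0.08, 0.04], [0.06, 0.84, 0.10], [0.0, 0.0, 1.0]],
    [[0.91, 0.06, 0.03], [0.04, 0.88, 0.08], [0.0, 0.0, 1.0]],
]
estimate = monte_carlo_estimate(posterior_samples, start_state=0, horizon=1)
print(round(estimate, 4))
```

The same averaging applies to other summaries, such as the expected number of 
future transactions, by replacing the function evaluated on each sample. 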

The method described herein may be implemented using any suitable 
programming language executed on any suitable computing platform. For example, 
Matlab (trade mark) may be used together with a personal computer. A user 
interface is provided such as a graphical user interface to allow an operator to 
control the computer program, for example, to adjust the model, to display the 
results and to manage input of customer data. Any suitable form of user interface 
may be used as is known in the art. 

Figure 4 is a schematic diagram of a computer system for generating 
statistical estimators of future customer behaviour. Data about past customer 
behaviour 42 is input to a processor 43 via an input 41. The processor uses this 
data to generate a Bayesian statistical model and uses this model to generate 
statistical estimators 44 of future customer behaviour. 



A range of applications are within the scope of the invention. These include 
situations in which it is required to determine one or more statistical estimators of 
customer behaviour. For example, it may be required to estimate the probability that 
a particular customer of a business will stop being a customer (for example by leaving 
a bank) at a specified time in the future, or to estimate the frequency and nature of 
future customer transactions. Using such estimates the lifetime value of particular 
customers to a business can be estimated. 
Claims 

1. A method of determining one or more statistical estimators of future customer 
behaviour comprising the steps of:- 

(i) accessing data about past customer behaviour; 

(ii) generating a Bayesian statistical model using the data about the past 
customer behaviour; and 

(iii) using the model to generate one or more statistical estimators of future 
customer behaviour. 

2. A method as claimed in claim 1 which further comprises accessing 
information about customer attributes and wherein said model is generated 
using the information about customer attributes. 

3. A method as claimed in claim 1 or claim 2 wherein the model comprises a 
representation of the customer behaviour in the form of a hidden Markov 
model. 

4. A method as claimed in claim 3 wherein said hidden Markov model has a 
random number of states. 

5. A method as claimed in claim 1 wherein said step of generating the model 
comprises clustering the past customer behaviour data into a plurality of 
states. 

6. A method as claimed in claim 5 wherein the behaviour of each customer over 
time is represented as a path through a plurality of the states and wherein 
these paths are unobserved and are considered random. 

7. A method as claimed in any of claims 4 to 6 wherein each state is 
characterised by a plurality of random state parameters. 
8. A method as claimed in claim 7 wherein past data about a customer's 
behaviour whilst that customer is in a particular state is assumed to follow a 
parametric probability model. 

9. A method as claimed in any preceding claim wherein said step of generating 
the Bayesian statistical model comprises specifying a plurality of Bayesian 
prior probability distributions. 

10. A method as claimed in claim 9 wherein said step of generating the model 
further comprises generating a plurality of Bayesian posterior probability 
distributions on the basis of at least the plurality of Bayesian prior probability 
distributions and the past customer data. 

11. A method as claimed in claim 1 wherein said step (iii) of using the model to 
generate one or more statistical estimators comprises the step of using a 
sampling method to draw approximate random samples from the posterior 
distribution and performing Monte Carlo inference using the samples to 
generate the statistical estimators. 

12. A method as claimed in claim 1 wherein said step (iii) of using the model to 
generate one or more statistical estimators comprises the step of numerically 
or analytically calculating the Bayesian posterior probability distributions. 

13. A method as claimed in any preceding claim wherein the statistical estimators 
comprise a probability that a customer will exhibit a certain behaviour. 

14. A method as claimed in any of claims 1 to 12 wherein the statistical 
estimators comprise the most probable behaviour exhibited by customers. 

15. A method as claimed in any preceding claim wherein the past customer data 
comprises information about customer transactions. 

16. A computer system for determining one or more statistical estimators of 
future customer behaviour comprising:- 

(i) an input arranged to access data about past customer behaviour; 

(ii) a processor arranged to generate a Bayesian statistical model using the data 
about the past customer behaviour; and 

(iii) wherein said processor is further arranged to use the model to generate one 
or more statistical estimators of future customer behaviour. 

17. A computer system as claimed in claim 16 wherein said data about past 
customer behaviour comprises customer attributes. 

18. A computer system as claimed in claim 16 or claim 17 wherein the processor 
is arranged to generate the model by clustering the past customer behaviour 
data into a plurality of states. 

19. A computer program for controlling a computer system such that one or more 
statistical estimators of future customer behaviour are determined said 
computer program being arranged to control the computer system such that:- 

(i) data about past customer behaviour is accessed; 

(ii) a Bayesian statistical model is generated using the data about the past 
customer behaviour; and 

(iii) using the model, one or more statistical estimators of future customer 
behaviour are generated. 

20. A computer program as claimed in claim 19 wherein said data about past 
customer behaviour comprises customer attributes. 

21. A computer program as claimed in claim 19 or claim 20 which is arranged to 
control the computer system such that the processor generates the model by 
clustering the past customer behaviour data into a plurality of states. 
ABSTRACT 

Method and apparatus for determining one or more statistical estimators of 
customer behaviour 

Businesses typically have large amounts of data about customer transactions and 
other customer information which is not fully utilised. The present invention provides 
a means of using this information to make predictions about future customer 
behaviour, for example by estimating the probability that a customer will leave a 
bank. Using these predictions the business is able to take action in order to improve 
its performance. Using customer data a Bayesian statistical model is generated and 
this model is used to generate statistical estimators of customer behaviour. The 
statistical model is formed using hidden Markov model techniques by clustering 
customer data and attributes (e.g. age, sex, salary) into a finite number of states. 
The number of states is unobserved and considered random. Bayesian prior 
probability distributions are specified and combined with the data to produce 
Bayesian posterior probability distributions. Using these Bayesian posterior 
probability distributions the statistical estimators are obtained. For example, Monte 
Carlo sampling techniques are used or alternatively the posterior distributions are 
calculated numerically or analytically. 
Figure 1 

Figure 2 

Specify Bayesian prior probability distributions for a plurality of states 

Specify Bayesian prior probability distributions for the probabilities 
of a new customer starting in each state 

Specify Bayesian prior probability distributions for the 
probabilities of moving between states 

Represent observed customer data for each state using a 
parametric probability model 

Combine the Bayesian prior probability distributions, the 
parametric probability models and the accessed data to 
generate a posterior probability distribution for each of: 
the number of states, the probabilities of a new customer 
starting in each state, the probabilities of moving between the 
different states and the state parameters (and the unobserved 
state paths if treated as random). 

Figure 3 

Figure 4 